Control Problem
8 pages tagged "Control Problem"
Could we program an AI to automatically shut down?
Can we stop an advanced AI from upgrading itself?
Can we test an AI to make sure it won't misbehave if it becomes superintelligent?
Can we constrain a goal-directed AI using specified rules?
Why can’t we just use Asimov’s Three Laws of Robotics?
Why can’t we just turn the AI off if it starts to misbehave?
What is interpretability and what approaches are there?
What is a “treacherous turn”?